Skip to content

Comments

feat: batch write/read APIs + v0.2.0#2

Merged
wolfiesch merged 2 commits intomainfrom
feat/batch-apis-v0.2.0
Feb 20, 2026
Merged

feat: batch write/read APIs + v0.2.0#2
wolfiesch merged 2 commits intomainfrom
feat/batch-apis-v0.2.0

Conversation

@wolfiesch
Copy link
Collaborator

Summary

  • Adds write_rows() for bulk grid writes at arbitrary positions, bypassing per-cell FFI overhead
  • Adds fast iter_rows(values_only=True) bulk read path via single Rust FFI call (6.7x faster than openpyxl)
  • Adds Rust backend support for write_sheet_values() and read_all_values() in both calamine and xlsxwriter backends
  • Adds rename_sheet() support in Rust writer backend
  • Bumps version from 0.1.2 to 0.2.0 (includes calc engine from PR feat(calc): formula evaluation engine with recursive expression parser #1)

Performance

API vs openpyxl How
ws.append(row) 3.7x faster write Buffers rows, single Rust call at save
ws.write_rows(grid) 3.7x faster write Same mechanism, arbitrary start position
ws.iter_rows(values_only=True) 6.7x faster read Single Rust call, no Cell objects

Test plan

  • Existing 227 tests pass (150 calc + 77 core)
  • New compat tests cover write_rows, bulk iter_rows, sheet rename
  • CI: 6 platform/version combos (ubuntu/macOS/windows x 3.9/3.12)

🤖 Generated with Claude Code

- Add write_rows() for bulk grid writes at arbitrary positions
- Add fast bulk iter_rows(values_only=True) via single Rust FFI call
- Add Rust backend support for write_sheet_values() and read_all_values()
- Add sheet rename_sheet() support in Rust writer backend
- Update README with batch API documentation and SynthGL case study
- Add comprehensive openpyxl compatibility test suite (379 lines)
- Bump version from 0.1.2 to 0.2.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 20, 2026 07:33
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds high-performance batch read/write APIs to wolfxl and bumps the version from 0.1.2 to 0.2.0. The changes introduce three key optimizations: bulk write via write_rows() (3.7x faster), bulk read via iter_rows(values_only=True) (6.7x faster), and sheet renaming support in the Rust backend.

Changes:

  • Added write_rows() API for bulk grid writes at arbitrary positions
  • Added fast path for iter_rows(values_only=True) using single Rust FFI call
  • Added rename_sheet() support in Rust writer backend with automatic title setter sync
  • Bumped version to 0.2.0 and added formulas as optional dependency for calc feature

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
uv.lock Added new dependencies (formulas, numpy, scipy, regex, etc.) for calc extra
pyproject.toml Version bump to 0.2.0 and added optional calc dependencies
Cargo.toml Version bump to 0.2.0
python/wolfxl/_worksheet.py Added write_rows(), bulk iter_rows path, title setter Rust sync, bulk writes buffer
src/rust_xlsxwriter_backend.rs Added rename_sheet() method with comprehensive re-keying of all data structures
src/calamine_styled_backend.rs Added read_sheet_values_plain() for direct Python value conversion
tests/test_wolfxl_compat.py Added 383 new test lines covering rename, bulk read, write_rows, plain read
README.md Added batch API documentation and performance comparison table

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 369 to 378
indiv_from_bulk: list[tuple[int, int, Any]] = []
for ri, row in enumerate(grid):
for ci, val in enumerate(row):
if val is not None and (
isinstance(val, bool)
or (isinstance(val, str) and val.startswith("="))
or not isinstance(val, (int, float, str))
):
indiv_from_bulk.append((sr + ri, sc + ci, val))
row[ci] = None
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a code duplication issue here. Lines 369-378 (bulk writes scan) are nearly identical to lines 346-354 (append buffer scan). Both loops:

  1. Check the same conditions for non-batchable values
  2. Collect indices in a list
  3. Replace with None in the grid

Consider extracting this into a helper method to avoid duplication and improve maintainability. For example:

def _extract_non_batchable(self, grid: list[list[Any]], start_row: int, start_col: int) -> list[tuple[int, int, Any]]:
    """Extract non-batchable values from grid, replacing them with None."""
    indiv: list[tuple[int, int, Any]] = []
    for ri, row in enumerate(grid):
        for ci, val in enumerate(row):
            if val is not None and (
                isinstance(val, bool)
                or (isinstance(val, str) and val.startswith("="))
                or not isinstance(val, (int, float, str))
            ):
                indiv.append((start_row + ri, start_col + ci, val))
                row[ci] = None
    return indiv

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a56f0fc. Extracted _extract_non_batchable() static method that both the append buffer and bulk write scanning now call.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: de8ef7eb78

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

self._append_buffer = []

# -- Flush bulk writes (write_rows) -----------------------------------
for grid, sr, sc in self._bulk_writes:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Flush write_rows queue when saving in modify mode

write_rows() now queues grids into _bulk_writes, but that queue is only consumed in the writer-specific flush loop shown here. In load_workbook(..., modify=True) flows, _flush() uses the patcher path and never drains _bulk_writes, so ws.write_rows(...) changes are silently dropped on save unless each target cell is also written through the dirty-cell path.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a56f0fc. Added _materialize_bulk_writes() method that converts bulk write buffers into Cell objects. The patcher flush path now calls it (alongside the existing _materialize_append_buffer()) before processing dirty cells. This ensures write_rows() data is flushed in modify mode.

or not isinstance(val, (int, float, str))
):
indiv_from_bulk.append((sr + ri, sc + ci, val))
row[ci] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid mutating caller-provided grids during bulk flush

The bulk write flush mutates the original grid in place via row[ci] = None for non-batchable values. Because write_rows() stores the caller's list object directly, this side effect leaks out after save() and can corrupt results when the same grid is reused in multiple write_rows() calls (later writes lose booleans/formulas/dates that were nulled during the first flush).

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a56f0fc. write_rows() now stores a shallow copy of the grid ([list(row) for row in rows]) so that the non-batchable extraction in flush can safely mutate row elements without affecting the caller's original lists.

Comment on lines 198 to 202
let s = ndt.date().format("%Y-%m-%d").to_string();
Ok(s.to_object(py))
} else {
let s = ndt.format("%Y-%m-%dT%H:%M:%S").to_string();
Ok(s.to_object(py))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve date/datetime types in values_only bulk reads

The new plain bulk conversion formats Data::DateTime as strings and returns them directly, so the fast iter_rows(values_only=True) path now yields string timestamps instead of the date/datetime objects produced by the existing payload conversion path. This breaks callers that rely on typed date values for comparisons or arithmetic when bulk reads are enabled.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a56f0fc. Updated data_to_plain_py() in Rust to return proper Python datetime.date / datetime.datetime objects using PyO3's PyDate::new() / PyDateTime::new() constructors. Also handles DateTimeIso values with proper midnight detection (returns date vs datetime). This matches the typed behavior of the non-bulk _payload_to_python() path.

1. Extract _extract_non_batchable() helper to eliminate code duplication
   between append buffer and bulk write scanning (Copilot).

2. Drain _bulk_writes in modify mode by materializing into Cell objects
   before patcher flush, preventing silent data loss (Codex P1).

3. Copy grids in write_rows() to avoid mutating caller-provided lists
   during flush (Codex P2).

4. Return typed Python date/datetime objects from data_to_plain_py()
   instead of ISO strings, matching the non-bulk read path behavior.
   Also handle DateTimeIso values with proper type detection (Codex P2).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wolfiesch wolfiesch merged commit 72a4615 into main Feb 20, 2026
7 checks passed
wolfiesch added a commit that referenced this pull request Feb 20, 2026
- POWER: return #NUM! for negative base with fractional exponent
  instead of leaking complex numbers (Copilot #2, Codex #11)
- LEFT: return #VALUE! for negative num_chars (Copilot #1)
- MID: return #VALUE! for start < 1 or num_chars < 0 (Copilot #8)
- range_shape: use min/max instead of abs() for consistency with
  expand_range (Copilot #3)
- Add @_requires_formulas skip marker to formulas-dependent test
  classes so CI passes without wolfxl[calc] extra (Codex #10)
- Add 4 edge case tests: POWER(-1,0.5), LEFT(-1), MID(0,..), MID(.,-1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant